{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "a0kDBAHvP7jw",
    "outputId": "f0375e07-7a6a-4a30-b7da-19f3bc02b421"
   },
   "source": [
    "License\n",
    "\n",
    "```\n",
    "Copyright (c) Facebook, Inc. and its affiliates.\n",
    "\n",
    "This source code is licensed under the MIT license found in the\n",
    "LICENSE file in the root directory of this source tree.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AidRbcu8Pwxh"
   },
   "source": [
    "# CompilerGym Getting Started\n",
    "\n",
    "CompilerGym is a toolkit for applying reinforcement learning to compiler optimization tasks. This document provides a short walkthrough of the key concepts, using the codesize reduction task of a production-grade compiler as an example. It will take about 20 minutes to work through. Lets get started!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SlTQST1TT2uf"
   },
   "source": [
    "## Key Concepts\n",
    "\n",
    "CompilerGym exposes compiler optimization problems as environments for reinforcement learning. It uses the [OpenAI Gym](https://gym.openai.com/) interface to expose the “agent-environment loop” of reinforcement learning:\n",
    "\n",
    "![overview](https://facebookresearch.github.io/CompilerGym/_images/overview.png)\n",
    "\n",
    "The ingredients for reinforcement learning that CompilerGym provides are:\n",
    "\n",
    "* **Environment**: a compiler optimization task. For example, *optimizing a C++ graph-traversal program for codesize using LLVM*. The environment encapsulates an instance of a compiler and a particular program that is being compiled. As an agent interacts with the environment, the state of the program, and the compiler, can change.\n",
    "* **Action Space**: the actions that may be taken at the current environment state. For example, this could be a set of optimization transformations that the compiler can apply to the program.\n",
    "* **Observation**: a view of the current environment state. For example, this could be the Intermediate Representation (IR) of the program that is being compiled. The types of observations that are available depend on the compiler.\n",
    "* **Reward**: a metric indicating the quality of the previous action. For example, for a codesize optimization task this could be the change to the number of instructions of the previous action.\n",
    "\n",
    "A single instance of this “agent-environment loop” represents the compilation of a particular program. The goal is to develop an agent that maximises the cumulative reward from these environments so as to produce the best programs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MBiwH2xDUiy-"
   },
   "source": [
    "## Installation\n",
    "\n",
    "Install the latest CompilerGym release using pip:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "CUKVhcr2P0Ui"
   },
   "outputs": [],
   "source": [
    "!pip install compiler_gym\n",
    "\n",
    "# Ignore this code block:\n",
    "#\n",
    "#   Explanation: to keep the size of the CompilerGym package down\n",
    "#   most environments split out their large data dependencies into\n",
    "#   separate archives which must be downloaded and cached locally\n",
    "#   the first time you run an environment. Here we are doing this\n",
    "#   for the LLVM environment so that the rest of the code cells in\n",
    "#   this document are nice and snappy.\n",
    "import compiler_gym\n",
    "with compiler_gym.make(\"llvm-autophase-ic-v0\") as env:\n",
    "    env.reset()\n",
    "# End of ignored code block."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "CaRZ_tt-Uqrx"
   },
   "source": [
    "See [INSTALL.md](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md) for alternative installation methods."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "J6hV1NlKQdN2"
   },
   "source": [
    "## Using CompilerGym\n",
    "\n",
    "To start with we import the gym module and the CompilerGym environments:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Qw0VakHSQe5J"
   },
   "outputs": [],
   "source": [
    "import gym\n",
    "import compiler_gym\n",
    "\n",
    "compiler_gym.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bXAmlDsUQ-B9"
   },
   "source": [
    "Importing `compiler_gym` automatically registers the compiler environments.\n",
    "\n",
    "We can see what environments are available using:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "hINZesIARAXT"
   },
   "outputs": [],
   "source": [
    "compiler_gym.COMPILER_GYM_ENVS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zKnNlIpXRAzF"
   },
   "source": [
    "## Selecting an environment\n",
    "\n",
    "CompilerGym environments are named using one of the following formats:\n",
    "\n",
    "* `<compiler>-<observation>-<reward>-<version>`\n",
    "* `<compiler>-<reward>-<version>`\n",
    "* `<compiler>-<version>`\n",
    "\n",
    "Where `<compiler>` identifiers the compiler optimization task, `<observation>` is the default type of observations that are provided, and `<reward>` is the reward signal.\n",
    "\n",
    "**Note** A key concept is that CompilerGym environments enables **lazy evaluation** of observations and reward signals. This increases computational efficiency sampling for scenarios in which you do not need to compute a reward or observation for every step. If an environment omits a `<observation>` or `<reward>` tag, this means that no observation or reward is provided by default. See [compiler_gym.views](https://facebookresearch.github.io/CompilerGym/compiler_gym/views.html) for further details.\n",
    "\n",
    "For this tutorial, we will use the following environment:\n",
    "* **Compiler**: [LLVM](https://facebookresearch.github.io/CompilerGym/llvm/index.html).\n",
    "* **Observation Type**: [Autophase](https://facebookresearch.github.io/CompilerGym/llvm/index.html#autophase).\n",
    "* **Reward Signal**: [IR Instruction count relative to -Oz](https://facebookresearch.github.io/CompilerGym/llvm/index.html#codesize).\n",
    "\n",
    "Create an instance of this environment using:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "peG5Jp_bRtTu"
   },
   "outputs": [],
   "source": [
    "env = gym.make(\"llvm-autophase-ic-v0\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note** The first time you run `gym.make()` you may see a logging message \"Downloading \\<url\\> ...\" followed by a delay of 1-2 minutes. This is CompilerGym downloading large environment-specific dependencies that are not shipped by default to keep the size of the package down. This is a one-off download that occurs only the first time the environment is used. Other operations that require one-off downloads include installing datasets (described below)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "xwc2xlJTSPSd"
   },
   "source": [
    "## The compiler environment\n",
    "\n",
    "If you have experience using [OpenAI Gym](https://gym.openai.com/), the CompilerGym environments will be familiar. If not, you can call `help()` on any function, object or method to query the documentation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "INSRF2LmSVV5"
   },
   "outputs": [],
   "source": [
    "help(env.step)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ci0Pc-81SWMh"
   },
   "source": [
    "The action space is described by `env.action_space`. The [LLVM Action Space](https://facebookresearch.github.io/CompilerGym/llvm/index.html#action-space) is discrete:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.action_space.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.action_space.n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The observation space is described by `env.observation_space`. The [Autophase](https://facebookresearch.github.io/CompilerGym/llvm/index.html#autophase) observation space is a 56-dimension vector of integers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.observation_space.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.observation_space.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The upper and lower bounds of the reward signal are described by `env.reward_range`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reward_range"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with other Gym environments, `reset()` must be called before a CompilerGym environment may be used:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The numpy array that is returned by `reset()` is the initial observation. This value, along with the entire dynamics of the environment, depend on the particular program that is being compiled. In CompilerGym these programs are called **benchmarks**. You can see which benchmark is currently being used by an environment using `env.benchmark`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.benchmark"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we want to compile a different program, we can pass the name of a benchmark to `env.reset()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset(benchmark=\"benchmark://npb-v0/50\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We provide over [a million benchmarks for the LLVM environments](https://facebookresearch.github.io/CompilerGym/llvm/index.html#datasets) that can be used for training agents and evaluating the\n",
    "generalization of strategies across unseen programs. Benchmarks are grouped into *datasets* , which are managed using `env.datasets`. You may also provide your own programs to use as benchmarks, see `env.make_benchmark()` for details."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4AbDyVS2SZJ5"
   },
   "source": [
    "## Interacting with the environment\n",
    "\n",
    "Once an environment has been initialized, you interact with it in the same way that you would with any other [OpenAI Gym](https://gym.openai.com/) environment. `env.render()` prints the Intermediate Representation (IR) of the program in the current state:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Ks6hQobrSi8x"
   },
   "outputs": [],
   "source": [
    "env.render()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lFyM5PAMSgP3"
   },
   "source": [
    "`env.step()` runs an action:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "WiS3lXSeSlQW"
   },
   "outputs": [],
   "source": [
    "observation, reward, done, info = env.step(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fe4LYhV7SnWp"
   },
   "source": [
    "This returns four values: a new observation, a reward, a boolean value indicating whether the episode has ended, and a dictionary of additional information:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Yy1FrFQxSrf0"
   },
   "outputs": [],
   "source": [
    "observation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Z90NAUVPSsNv"
   },
   "outputs": [],
   "source": [
    "reward"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "9EnM69y0SsxW"
   },
   "outputs": [],
   "source": [
    "done"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "QI2uTajIStC9"
   },
   "outputs": [],
   "source": [
    "info"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8ZMmVE0hStV0"
   },
   "source": [
    "For this environment, reward represents the reduction in code size of the \n",
    "previous action, scaled to the total codesize reduction achieved with LLVM's `-Oz` optimizations enabled. A cumulative reward greater than one means that the sequence of optimizations performed yields better results than LLVM's default optimizations. Let's run 100 random actions and see how close we can  get:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "7p_UzcyZTRL0"
   },
   "outputs": [],
   "source": [
    "env.reset(benchmark=\"benchmark://npb-v0/50\")\n",
    "episode_reward = 0\n",
    "for i in range(1, 101):\n",
    "    observation, reward, done, info = env.step(env.action_space.sample())\n",
    "    if done:\n",
    "        break\n",
    "    episode_reward += reward\n",
    "    print(f\"Step {i}, quality={episode_reward:.2%}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "U7CQAkTSTasn"
   },
   "source": [
    "Not bad, but clearly there is room for improvement! Because at each step we are taking random actions, your results will differ with every run. Try running it again. Was the result better or worse? Of course, there may be better ways of selecting actions than choosing randomly, but for the purpose of this tutorial we will leave that as an exercise for the reader :)\n",
    "\n",
    "Before we finish, lets use `env.commandline()` to produce an LLVM `opt` command line invocation that is equivalent to the sequence of actions we just run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "E2whGdgKTiSQ"
   },
   "outputs": [],
   "source": [
    "env.commandline()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UaeTCnbZTmUe"
   },
   "source": [
    "We can also save the program in its current state for future reference:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "_dtaMwEKTehC"
   },
   "outputs": [],
   "source": [
    "env.write_bitcode(\"/tmp/program.bc\")\n",
    "\n",
    "!ls /tmp/program.bc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "uIC-ZxIeTqFH"
   },
   "source": [
    "Once we are finished, we must close the environment to end the compiler session:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Next Steps\n",
    "\n",
    "Now that you have got to grips with the compiler environment, take a browse through the [examples directory](https://github.com/facebookresearch/CompilerGym/tree/stable/examples) for pytorch integration, agent implementations, etc. Then check out [the leaderboards](https://github.com/facebookresearch/CompilerGym#leaderboards) to see what the best performing algorithms are, and [the documentation](https://facebookresearch.github.io/CompilerGym/) for details of the APIs and environments. We love feedback, bug reports, and feature requests - please [file an issue](https://github.com/facebookresearch/CompilerGym/issues/new/choose)!"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "name": "CompilerGym Getting Started.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python (compiler_gym)",
   "language": "python",
   "name": "compiler_gym"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
